April 30, 2018

Overview of Work/Research

  • Neuroimaging and R (Neuroconductor Project)
  • R Package Development/“Data Science”
  • Segmentation/Classification of:
    • White Matter Lesions in Multiple Sclerosis
    • Brain vs. Skull (CT)
    • Brain Hemorrhage/Stroke (CT)

Brain Image Processing in R

Workflow for an Analysis

  • bash
  • FSL
  • ANTs
  • MRIcroGL
  • OsiriX
  • SPM 12

Multiple pieces of software used

  • all different syntax

Goal:

Lower the bar to entry

  • all R code
    • pipeline tool
    • “native” R code

Complete pipeline

  • preprocessing and analysis

What did medical imaging in R have?


Bioinformatics Repository: Bioconductor

  • centralized bioinformatics/genomics packages
  • large community/number of packages (> 1300)
  • published tutorials and workflows
  • additional requirements to CRAN (e.g. packages need vignettes)

Neuroconductor: An R Platform for Medical Imaging Analysis (Muschelli et al. 2018)

https://neuroconductor.org/

Authored R Packages:

  • fslr

    (Muschelli, John, et al. “fslr: Connecting the FSL Software with R.” The R Journal 7.1 (2015): 163-175.)

  • brainR

    (Muschelli, John, Elizabeth Sweeney, and Ciprian Crainiceanu. “brainR: Interactive 3 and 4D Images of High Resolution Neuroimage Data.” The R Journal 6.1 (2014): 42-48.)

  • ichseg

    (Muschelli, John, et al. “PItcHPERFeCT: Primary intracranial hemorrhage probability estimation using random forests on CT.” NeuroImage: Clinical 14 (2017): 379-390.)

  • extrantsr
  • dcm2niir
  • matlabr
  • spm12r
  • freesurfer
  • itksnapr
  • stapler
  • gifti
  • cifti
  • papayar
  • diffr
  • gcite
  • rscopus
  • fedreporter
  • glassdoor

Number of Downloads (from cranlogs)

Lesion Segmentation of MS

Public Dataset with Lesion Segmentation

Demographic Data

  • Patients were on many different therapies (9 on no therapy); age IQR: 33 - 42
Variable                          Overall
n                                 30
Age, mean (SD)                    39.27 (10.12)
Sex = male, n (%)                 7 (23.3)
EDSS, mean (SD)                   2.61 (1.88)
Lesion volume, mean (SD)          17.40 (16.13)
MS subtype, n (%)
  Clinically isolated syndrome    2 (6.7)
  Progressive-relapsing           1 (3.3)
  Relapsing-remitting             24 (80.0)
  Secondary-progressive           2 (6.7)
  Unspecified                     1 (3.3)

Imaging Data

  • 2D T1 (TR=2000ms, TE=20ms, TI=800ms), before and after gadolinium
  • 2D T2 (TR=6000ms, TE=120ms)
  • 3D FLAIR (TR=5000ms, TE=392ms, TI=1800ms)
    • Fluid-attenuated inversion recovery: reduces the signal from fluids
  • All had a flip angle of 120\(^{\circ}\)

Terminology: Neuroimaging to Data/Statistics

  • Segmentation ⇔ classification
  • Image ⇔ 3-dimensional array
  • Mask/Region of Interest ⇔ binary (0/1) image
  • Registration ⇔ Spatial Normalization/Standardization
    • “Lining up” Brains

Image Representation: voxels (3D pixels)

Step 1: Image Processing

Figure from Multi-Atlas Skull Stripping method paper (Doshi et al. 2013):

  • Register templates to an image using the T1 for that subject
  • Apply transformation to the label/mask
  • Average each voxel over all templates
    • there are “smarter” (e.g. weighted) ways
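
The averaging step can be sketched as follows. This assumes the template masks have already been registered into the subject's space and flattened to one value per voxel; the function name and the 0.5 threshold are illustrative choices, not part of the published method.

```python
def fuse_labels(registered_masks, threshold=0.5):
    """Average binary masks voxel by voxel, then threshold the average.

    registered_masks: list of equal-length 0/1 lists, one per template,
    already transformed into the subject's space (assumed done upstream).
    """
    n_templates = len(registered_masks)
    n_voxels = len(registered_masks[0])
    averaged = [
        sum(mask[v] for mask in registered_masks) / n_templates
        for v in range(n_voxels)
    ]
    fused = [1 if p > threshold else 0 for p in averaged]
    return averaged, fused


# Three toy template masks over five voxels.
masks = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
averaged, fused = fuse_labels(masks)
```

With equal weights and a 0.5 threshold this is a majority vote; the weighted variants mentioned above would replace the plain mean with a weighted mean.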

Step 2: Create Predictors for each Sequence

  • Quantile images, smoothers, local moments
  • Tissue class probability
  • Z-score to a population template
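
Two of these predictors are simple enough to sketch directly. The code below works on a 1-dimensional row of voxels for brevity (the real images are 3D), and the window radius and template values are made-up illustrations.

```python
def zscore_to_template(subject, template_mean, template_sd):
    """Voxel-wise z-score of a subject image against a population template."""
    return [(x - m) / s for x, m, s in zip(subject, template_mean, template_sd)]


def local_mean(values, radius=1):
    """First local moment: mean over a window of +/- radius voxels,
    truncated at the edges of the image."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out


subject = [10.0, 12.0, 20.0, 11.0]
template_mean = [10.0, 10.0, 10.0, 10.0]   # hypothetical population mean image
template_sd = [2.0, 2.0, 2.0, 2.0]         # hypothetical population SD image

z = zscore_to_template(subject, template_mean, template_sd)
smoothed = local_mean(subject, radius=1)
```

Each predictor is itself an image of the same size, so every voxel ends up with one value per predictor per sequence.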

A package to do all this

  • All processing is implemented in the smri.process GitHub package (muschellij2/smri.process)

Data Structure for One Patient

Step 3: Aggregate Data

Training Data Structure

  • Sample 10% of the voxels (save computation time)
  • Stack together 14 randomly selected patients, stratified by age (over median) and volume
  • Train model/classifier on this design matrix
  • Smooth the probability map
  • Test on the 16 held-out patients
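
The voxel sampling and stacking steps can be sketched as below. The data layout (a dict from patient ID to per-voxel rows), the function names, and the seeds are assumptions for illustration; the stratification of patients and the smoothing of the probability map are not shown.

```python
import random


def sample_voxels(n_voxels, fraction=0.10, seed=0):
    """Pick a random subset of voxel indices, without replacement."""
    rng = random.Random(seed)
    k = int(round(n_voxels * fraction))
    return sorted(rng.sample(range(n_voxels), k))


def stack_patients(patients, fraction=0.10, seed=0):
    """Stack the sampled voxels of all training patients into one long table.

    patients: dict mapping patient ID -> list of per-voxel rows
    (hypothetical layout: predictors followed by the 0/1 outcome).
    Returns a list of (patient_id, voxel_index, row) tuples.
    """
    rows = []
    for offset, (pid, voxel_rows) in enumerate(sorted(patients.items())):
        keep = sample_voxels(len(voxel_rows), fraction, seed + offset)
        rows.extend((pid, v, voxel_rows[v]) for v in keep)
    return rows


# Two toy patients with 50 and 30 voxels, one predictor plus an outcome.
patients = {
    "p01": [[float(v), v % 2] for v in range(50)],
    "p02": [[float(v), v % 2] for v in range(30)],
}
design = stack_patients(patients)
```

Keeping the patient ID and voxel index alongside each row makes it possible to map predictions back into image space later.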

Step 4: Fit Models / Classifier

Let \(Y_{i}(v)\) be the presence / absence of lesion for voxel \(v\) from person \(i\).

General model form:

\[ P(Y_{i}(v) = 1) \propto f(X_{i}(v)) \]

  • Similar thinking in OASIS: logistic regression with the images plus the interaction of each image with a 10mm\(^3\) and a 20mm\(^3\) smoother (Sweeney et al. 2013):

\[ f(X_{i}(v)) = \text{expit} \left\{ \beta_0 + \sum_{k \in \{T1, T2, FLAIR, PD\}} \left[ x_{i,k}(v)\beta_{k} + x_{i,k}(v) \, x_{i,10,k}(v) \beta_{10,k} + x_{i,k}(v) \, x_{i,20,k}(v) \beta_{20,k} \right] \right\} \]

    • Does not include T1Post, but did include PD originally
  • OASIS is run with both the original model from the paper and a re-trained model
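
As a rough illustration of the OASIS-style model form above, the sketch below evaluates the linear predictor for a single voxel and passes it through expit. All coefficient and intensity values are invented for the example; they are not the fitted OASIS coefficients.

```python
import math


def expit(z):
    """Inverse logit, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def oasis_style_probability(x, x10, x20, beta0, beta, beta10, beta20):
    """Lesion probability at one voxel.

    x, x10, x20: dicts mapping modality -> intensity, 10mm-smoothed
    intensity, and 20mm-smoothed intensity at that voxel.
    beta, beta10, beta20: dicts of main-effect and interaction coefficients.
    """
    eta = beta0
    for k in x:
        eta += x[k] * beta[k]
        eta += x[k] * x10[k] * beta10[k]
        eta += x[k] * x20[k] * beta20[k]
    return expit(eta)


modalities = ["T1", "T2", "FLAIR", "PD"]
x = {k: 1.0 for k in modalities}     # hypothetical normalized intensities
x10 = {k: 0.5 for k in modalities}   # hypothetical 10mm smoother values
x20 = {k: 0.2 for k in modalities}   # hypothetical 20mm smoother values
zeros = {k: 0.0 for k in modalities}
flair_only = dict(zeros, FLAIR=2.0)  # made-up coefficient, FLAIR main effect only

p_null = oasis_style_probability(x, x10, x20, 0.0, zeros, zeros, zeros)
p_flair = oasis_style_probability(x, x10, x20, 0.0, flair_only, zeros, zeros)
```

With every coefficient at zero the probability is 0.5, and a positive FLAIR coefficient pushes a bright FLAIR voxel toward "lesion".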

Models Fit on the Training Data

  • Logistic Regression: \(f(X_{i}(v)) = \text{expit} \left\{ \beta_0 + \sum_{k= 1}^{p} x_{i, k}(v)\beta_{k}\right\}\), \(p = 85\)
  • Random Forests (Breiman 2001; Wright and Ziegler 2017): \(f(X_{i}(v)) \propto\) RF
    • With 5-fold cross-validation, default 500 trees, mtry: \(\sqrt{p}\)
    • With and without the T1-Post for comparison to OASIS
  • Estimate a probability cutoff on training data
  • Predict on test data, assess performance across all voxels in the brain
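
The slides do not say how the probability cutoff is chosen. One plausible approach, shown here purely as an assumption, is a grid search over candidate cutoffs that maximizes the Dice coefficient on the training voxels. The probabilities, labels, and grid are toy values.

```python
def dice(pred, truth):
    """Dice coefficient between two binary (0/1) vectors."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    # Convention (an assumption): two empty segmentations agree perfectly.
    return 2 * tp / denom if denom else 1.0


def best_cutoff(probs, truth, cutoffs):
    """Return (cutoff, dice) for the candidate cutoff with the highest Dice."""
    scored = [(c, dice([1 if p >= c else 0 for p in probs], truth)) for c in cutoffs]
    return max(scored, key=lambda pair: pair[1])


train_probs = [0.05, 0.2, 0.35, 0.6, 0.8, 0.9]   # toy predicted probabilities
train_truth = [0, 0, 1, 0, 1, 1]                 # toy manual segmentation
cutoff, train_dice = best_cutoff(train_probs, train_truth, [0.1, 0.3, 0.5, 0.7])
```

The chosen cutoff is then held fixed and applied to the test-set probability maps, so the test performance is not tuned on the test data.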

Assessing Performance

For each test scan, and over all test scans, we can calculate the following 2-by-2 table, where each cell is a number of voxels, along with a corresponding Venn diagram:

                 Manual
                 0     1
PitCH      0     TN    FN
           1     FP    TP
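
From this table the Dice similarity index is \(2TP / (2TP + FP + FN)\). The sketch below computes it per scan and pooled over scans. Pooling by summing the cells across scans is one reading of "over all test scans" and is an assumption here; the toy vectors are illustrative.

```python
def confusion(pred, manual):
    """Count voxels in each cell of the 2-by-2 table."""
    table = {"TN": 0, "FN": 0, "FP": 0, "TP": 0}
    for p, m in zip(pred, manual):
        if p == 1 and m == 1:
            table["TP"] += 1
        elif p == 1 and m == 0:
            table["FP"] += 1
        elif p == 0 and m == 1:
            table["FN"] += 1
        else:
            table["TN"] += 1
    return table


def dice_from_table(t):
    """Dice similarity index: 2TP / (2TP + FP + FN). TN does not enter."""
    return 2 * t["TP"] / (2 * t["TP"] + t["FP"] + t["FN"])


# Two toy test scans: (predicted mask, manual mask), flattened to voxels.
scans = {
    "scan_a": ([1, 1, 0, 0], [1, 0, 1, 0]),
    "scan_b": ([1, 1, 1, 0], [1, 1, 0, 0]),
}
tables = {name: confusion(pred, manual) for name, (pred, manual) in scans.items()}
per_scan_dice = {name: dice_from_table(t) for name, t in tables.items()}

# Pooled table: add the cells over scans, then compute a single Dice.
pooled = {cell: sum(t[cell] for t in tables.values()) for cell in ("TN", "FN", "FP", "TP")}
pooled_dice = dice_from_table(pooled)
```

Because true negatives do not enter the formula, Dice is not inflated by the large number of non-lesion voxels in the brain.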

Dice Results (Triangle is population Dice)

RF Predicted Volume Estimates True Volume

OASIS: not so much

Patient with Median DSI (0.63) in Test

Patient with High DSI (0.73) in Test

Brain Stem Lesions Estimated

Conclusions of Lesion Analyses

  • We can segment MS lesions reasonably well

  • Better models with larger samples

  • Needs to be more stable/accurate for a biomarker
    • Location may also be relevant and not taken into account
    • Is the brain stem an area we should focus on or remove from assessment?

Next Steps/Questions

  • Run the new processing on the 131 patients from the OASIS paper
  • Gray matter injury estimation
  • Is EDSS the clinical score we should be relating this to?
  • Identify “black hole” lesions using the T1-post image; these may show “active” lesions

Thank You

Breiman, Leo. 2001. “Random Forests.” Machine Learning 45 (1). Springer:5–32.

Doshi, Jimit, Guray Erus, Yangming Ou, Bilwaj Gaonkar, and Christos Davatzikos. 2013. “Multi-Atlas Skull-Stripping.” Academic Radiology 20 (12). Elsevier:1566–76.

Lesjak, Žiga, Alfiia Galimzianova, Aleš Koren, Matej Lukin, Franjo Pernuš, Boštjan Likar, and Žiga Špiclin. 2018. “A Novel Public MR Image Dataset of Multiple Sclerosis Patients with Lesion Segmentations Based on Multi-Rater Consensus.” Neuroinformatics 16 (1). Springer:51–63.

Muschelli, John, Adrian Gherman, Jean-Philippe Fortin, Brian Avants, Brandon Whitcher, Jonathan D Clayden, Brian S Caffo, and Ciprian M Crainiceanu. 2018. “Neuroconductor: An R Platform for Medical Imaging Analysis.” Biostatistics.

Sweeney, Elizabeth M, Russell T Shinohara, Navid Shiee, Farrah J Mateen, Avni A Chudgar, Jennifer L Cuzzocreo, Peter A Calabresi, Dzung L Pham, Daniel S Reich, and Ciprian M Crainiceanu. 2013. “OASIS Is Automated Statistical Inference for Segmentation, with Applications to Multiple Sclerosis Lesion Segmentation in MRI.” NeuroImage: Clinical 2. Elsevier:402–13.

Wright, Marvin N., and Andreas Ziegler. 2017. “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software 77 (1):1–17. https://doi.org/10.18637/jss.v077.i01.